Parallel Data Cube Construction: Algorithms, Theoretical Analysis, and Experimental Evaluation
Abstract
Data cube construction is a commonly used operation in data warehouses. Because of the volume of data stored and analyzed in a data warehouse, and the amount of computation involved in data cube construction, it is natural to consider parallel machines for this operation. This paper presents two new algorithms for parallel data cube construction, along with their theoretical analysis and experimental evaluation. Our work is based on a new data structure, called the aggregation tree, which results in minimally bounded memory requirements. An aggregation tree is parameterized by the ordering of dimensions. We prove that the same ordering of the dimensions minimizes both the computational and the communication requirements for both algorithms. We also describe a method for partitioning the initial array, which again minimizes the communication volume for both algorithms. Experimental results further validate the theoretical results.
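For readers unfamiliar with the operation, the following is a minimal sequential sketch of what data cube construction computes: every group-by (sum) over every subset of the dimensions of an input array. It is not the paper's aggregation tree or either of its parallel algorithms. The names `build_cube` and `cells`, the sparse dictionary representation, and the "pick the smallest parent" rule are assumptions made only for illustration.

```python
from itertools import combinations


def build_cube(cells, n_dims):
    """Return {kept_dims: {coords: sum}} for every subset of dimensions.

    `cells` maps a full coordinate tuple to a numeric value (hypothetical
    sparse representation of the initial array).
    """
    # The top of the lattice is the input array itself.
    cube = {tuple(range(n_dims)): dict(cells)}
    # Walk the lattice from n-1 retained dimensions down to 0.
    for k in range(n_dims - 1, -1, -1):
        for kept in combinations(range(n_dims), k):
            # Any already-computed view with one extra dimension can serve
            # as the parent; here we pick the one with the fewest cells.
            parents = [p for p in cube
                       if len(p) == k + 1 and set(kept) <= set(p)]
            parent = min(parents, key=lambda p: len(cube[p]))
            positions = [parent.index(d) for d in kept]
            view = {}
            for coords, value in cube[parent].items():
                key = tuple(coords[i] for i in positions)
                view[key] = view.get(key, 0) + value
            cube[kept] = view
    return cube


if __name__ == "__main__":
    cells = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4}
    for dims, view in build_cube(cells, 2).items():
        print(dims, view)
```

An n-dimensional input yields 2^n views, which is why the choice of parent for each view, and in the paper the ordering of dimensions, affects the total amount of computation.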
Similar resources
Impact of Data Distribution, Level of Parallelism, and Communication Frequency on Parallel Data Cube Construction
Data cube construction is a commonly used operation in data warehouses. Because of the volume of data that is stored and analyzed in a data warehouse and the amount of computation involved in data cube construction, it is natural to consider parallel machines for this operation. We have developed a set of parallel algorithms for data cube construction using a new data structure called aggregati...
Parallel Construction of Data Cubes on Multi-Core Multi-Disk Platforms
On-line Analytical Processing (OLAP) has become one of the most powerful and prominent technologies for knowledge discovery in VLDB (Very Large Database) environments. Central to the OLAP paradigm is the data cube, a multi-dimensional hierarchy of aggregate values that provides a rich analytical model for decision support. Various sequential algorithms for the efficient generation of the data c...
Computing Partial Data Cubes
The precomputation of the different views of a data cube is critical to improving the response time of data cube queries for On-Line Analytical Processing (OLAP). However, the user is often not interested in the set of all views of the data cube but only in a certain subset of views. In this paper, we study the problem of computing the partial data cube, i.e. a subset of selected views in the l...
Cube-Lifecycle Management and Applications
A common operation involved with the majority of algorithms relevant to On-Line Analytical Processing is aggregation, which can be extremely time-consuming if applied over large datasets. To overcome this drawback, scientists have proposed the precomputation and materialization of a large volume of aggregated data into a structure called data cube. Nevertheless, the construction and usage of th...
Parallel data cube construction for high performance on-line analytical processing
Decision support systems use On-Line Analytical Processing (OLAP) to analyze data by posing complex queries that require different views of data. Traditionally, a relational approach (ROLAP) has been taken to build such systems. More recently, multi-dimensional database techniques (MOLAP) have been applied to decision-support applications. Data is stored in multi-dimensional arrays which is a n...
Publication date: 2003